Cyberattacks have grown steadily over the last few years. The distributed reflection denial of
service (DRDoS) attack, a new variant of the distributed denial of service (DDoS) attack, has been
on the rise. DRDoS attacks are more difficult to mitigate because of their dynamics and attack
strategy. An intrusion detection system investigates the behavior of traffic, and the number of
features it uses influences its performance. A feature selection model can therefore improve the
accuracy of the detection mechanism and also reduce the detection time by reducing the number of
features. The proposed model, called the proactive feature selection (PFS) model, aims to detect
DRDoS attacks based on feature selection. It uses nature-inspired optimization algorithms for
feature subset selection. Three machine learning algorithms, i.e., k-nearest neighbor (KNN),
random forest (RF), and support vector machine (SVM), were evaluated as potential classifiers for
assessing the selected features. The CICDDoS2019 dataset was used for evaluation, and the
performance of each classifier is compared with previous models. The results indicate that the
proposed model outperforms current approaches, providing a higher detection rate (DR), a lower
false-positive rate (FPR), and higher detection accuracy (DA). The best accuracy achieved by the
PFS model in detecting DRDoS attacks is 89.59%.
1. INTRODUCTION


In today's computing age, users rely on smartphones and portable devices to access banking,
networking, online shopping, retail, gaming, and media content through web applications over the
internet [1]. The use of online applications for these tasks has also expanded the number of
users. Attackers continually develop their methods to bring down a network or to prevent
legitimate users from using the resources or services of the victim network [2], which leads to
loss of business and revenue [3]. The growth in cyber threats calls for concern and for effective
solutions that reduce the risks and the economic impact of these threats. Distributed denial of
service (DDoS) is a large-scale cyberattack in which hackers use more than one attack point to
produce a massive volume of malicious traffic from several sources. The hacker uses different
tools or programs to create a huge flood of malicious traffic with one or more attack vectors [4].
Hackers use denial of service (DoS) and distributed DoS attacks primarily to disable or degrade
service performance. The presence of more than one attack vector from several sources poses a
challenge for security administrators in intrusion detection [5]. These attacks can occur at the
network, transport, and application layers of the open systems interconnection (OSI) model.
According to the Arbor Networks Inc. (2019) report, DDoS attacks continued to be the dominant
challenge seen by the overwhelming majority of service providers, and the biggest attack in 2018
reached 1.7 Tbps [6]. DDoS attacks are further abused through reflection and exploitation behavior
at the application level using transport-level protocols. These attacks are referred to as
distributed reflection-based and exploitation-based DoS attacks (distributed reflection denial of
service (DRDoS) and distributed exploitation-based denial of service (DEDoS)) [7]. DRDoS attacks
are performed at the application level using the transmission control protocol (TCP), the user
datagram protocol (UDP), or a mixture of both. In a DRDoS attack, the attacker sends forged
requests to multiple servers with the victim's address spoofed as the source address. The servers
then send their replies to the victim, and these responses are often many times larger than the
requests [8]. Furthermore, DRDoS attacks appear to be more dedicated and complex, with greater
diversity, which creates the need for a fast, intelligent, and powerful attack identification
system for the security control of frequently vulnerable networks [9].

Securing data is very important, especially with the rapid increase in the number of users and
devices connected to the internet, which has led to a rise in the amount of data. In recent years,
a new type of cyberattack, the DRDoS attack, has emerged. The fast development of these attacks
and the variety of their methods have led researchers and cybersecurity companies to focus on
detecting them, and many studies have sought to address this issue. Classical detection methods
that use a static threshold are not suitable for high-dimensional data; therefore, we propose a
new detection method based on an adaptive threshold for designing a proactive feature selection
model.

This paper presents a new model, called the proactive feature selection (PFS) model, to detect
multiple classes of DRDoS attacks and then classify them. The PFS model is based on swarm
optimization and evolutionary algorithms (SWEVO) and machine learning (ML), with an adaptive
threshold as the fitness function. The primary function of the PFS model is to reduce the number
of features. To this end, the adaptive threshold (fitness function) is updated at every search
step in the population to eliminate irrelevant or redundant features.

This paper aims to present the measures used to address issues related to the dataset and the
actions taken to enhance the detection of DRDoS attacks. The significant contributions of this
study are summarized as follows: i) the proposed feature selection algorithm relies on
metaheuristic optimization algorithms and an adaptive threshold to optimize the performance of the
detection mechanism by reducing the number of features; ii) the new PFS model detects DRDoS
attacks by diagnosing the vulnerabilities in the intrusion detection system that those attacks
exploit, thereby improving detection accuracy. The results show that the PFS model can detect
several types of those attacks with high accuracy; iii) the PFS model has been tested, and the
results are compared with three well-known metaheuristic optimization algorithms (particle swarm
optimization (PSO), bat algorithm (BA), and differential evolution (DE)). The comparison tables
are given in the results and discussion section.

The CICDDoS2019 dataset was used to test the performance, reliability, and validity of the
proposed method in detecting DRDoS attacks. The results indicate that the PFS model attains an
accuracy of 89.59% in detecting several kinds of DRDoS attacks on different protocols, together
with a drop in the false alarm rate. Two further points are noted here: i) evaluation strategy:
the CICDDoS2019 dataset includes multiple classes of DRDoS attacks. The essential evaluation
metrics are accuracy, precision, recall, F1-score, confusion matrix, and the number of features.
The PFS model is used to improve the accuracy metrics and to reduce the number of features
compared with the original dataset. The evaluation indicates that the PFS model achieves a high
true-positive rate and a low false-negative rate. Moreover, the accuracy metrics of the proposed
model are compared with those of other techniques and are found to be higher; and ii) paper
organization: the remainder of this paper is organized as follows. Section 2 reviews the related
work. Section 3 describes the swarm optimization and evolutionary algorithms used to develop our
model, as well as the machine learning algorithms used as classifiers. Section 4 describes the
proposed model for feature selection. Section 5 describes the experiments, and section 6 presents
our conclusion.


2. RELATED WORK 

Earlier research used SWEVO algorithms to improve detection and reduce false positives. The number
of features used plays a critical role in the quality of DRDoS detection. Sharafaldin et al. [7]
suggested a new model that can detect several types of DDoS attacks, including various kinds of
DRDoS attacks. The proposed model was designed using four machine learning algorithms: ID3, random
forest (RF), naive Bayes, and multinomial logistic regression. The model was tested on the
CICDDoS2019 dataset, which contains 88 features. Table 1 shows the accuracy metrics of the model
and its effectiveness against DRDoS attacks. The main limitation of this work is that it applies
only the classifier algorithms to the same, unreduced set of features, which often contributes to
poor detection and high misclassification rates.

Sharafaldin [10] suggested a feature selection model to enhance the intrusion detection system
(IDS) using four SWEVO algorithms: PSO, grey wolf optimizer (GWO), firefly optimization (FFA), and
genetic algorithm (GA). The features extracted by the suggested model are assessed with the
support vector machine (SVM) and J48 ML classifiers on the UNSW-NB15 dataset. The model contains
13 rules; the two essential rules, R12 and R13, show high accuracy and reduce the number of
features, which enhances the IDS performance. The main limitation of this work is that the dataset
used does not contain essential attack types such as DDoS attacks. Therefore, the IDS may be
inefficient against modern cyberattacks.

Vijayanand and Devaraj [11] proposed an approach based on a modified whale optimization algorithm
(WOA). The performance of the improved approach was evaluated using SVM on two standard datasets:
the intrusion detection evaluation dataset (CICIDS2017) and the Australian defense force academy
Linux dataset (ADFA-LD). The selected features were the basis for identifying kinds of intrusion.
Informative features were selected to help increase the accuracy of the SVM-based IDS. By choosing
the informative features with the enhanced WOA, the efficiency of the IDS was improved, and the
attack identification ratio was better than that of the regular WOA.

Ghasemi et al. [12] proposed a new hybrid model for feature selection in IDS that builds on GA and
four classification algorithms. This model is called kernel extreme learning machine (KELM). The
performance of the KELM model was evaluated on the network security layer-knowledge discovery in
database (NSL-KDD) standard dataset, an enhanced version of the KDD CUP 99 dataset. Through the
implementation of the KELM, a new dataset called GA-dataset is produced. The KELM enhanced by GA
achieved high accuracy and a low false alarm rate on the GA-dataset.

Sarvari et al. [13] suggested a new feature selection approach called mutation cuckoo fuzzy (MCF)
to select the optimal features. For classification, a multiverse optimizer-artificial neural
network (MVO-ANN) is used. The suggested search algorithm uses mutation to examine the search
space more accurately. The performance and relevance of the MCF model for IDS problems were
validated on the NSL-KDD standard dataset.

Patil and Kshirsagar [14] suggested a system based on feature selection to detect DDoS attacks.
Feature selection is performed using information gain with the Ranker algorithm. The proposed
method uses RF, J48, and logistic model tree (LMT) classifiers to detect DDoS attacks, and the
system was tested on the CICIDS2017 dataset. The experimental results reveal that the J48
classifier, with the major features, achieves better detection performance than random forest and
LMT. The main limitations of previous research are the use of a static threshold with
high-dimensional datasets and, in some studies, the use of machine learning algorithms alone when
proposing a new feature selection model. To address these shortcomings, we propose a new proactive
feature selection model based on an adaptive threshold to enhance the detection accuracy rate.


3. SWARM OPTIMIZATION AND EVOLUTIONARY ALGORITHMS (SWEVO) AND MACHINE LEARNING (ML)
ALGORITHMS

The proposed mechanism for detecting DRDoS attacks is based on two swarm optimization algorithms
and one evolutionary algorithm, together with three machine learning algorithms used as
classifiers. Machine learning-based IDSs can reach satisfactory detection levels, and machine
learning models generalize well enough to detect attack variants and novel threats. SWEVO
algorithms form a promising research area in computer science that is motivated by the natural
evolution of biological organisms [15]. Many heuristic algorithms derived from the natural
behavior of biological or physical systems have been suggested as robust methods for global
optimization [16]. Machine learning techniques have been commonly applied to cybersecurity
challenges. ML combines statistics and artificial intelligence to learn a model from data [17],
[18]. In cybersecurity, ML techniques can effectively suggest the correct decision for analysis
and even automatically perform the appropriate response [19]. ML approaches can be divided into
supervised, semi-supervised, and unsupervised approaches [20], [21].


3.1. Particle swarm optimization (PSO) 


Kennedy and Eberhart presented PSO in 1995. It imitates the swarm behavior of flocking birds and
schooling fish to direct particles searching for globally optimal solutions [22]. PSO is a common
swarm intelligence algorithm employed in continuous search spaces to solve global optimization
problems [23], [24].
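
As an illustration of the particle update that PSO performs, the following is a minimal sketch in plain Python. It uses the widely used inertia-weight form of the velocity update; the parameter values (`w`, `c1`, `c2`), the bounds, and the toy `sphere` objective are assumptions chosen for the example and are not the settings used in the experiments of this paper.

```python
import random

def pso_minimize(objective, dim, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` with a basic inertia-weight PSO (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

best, best_val = pso_minimize(sphere, dim=3)
```

In a feature selection setting, each particle position would hold one rank value per feature rather than a point in a toy search space.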


3.2. Bat algorithm (BA)

Xin-She Yang presented the BA in 2010. The bat algorithm relies on the echolocation behavior of
bats. Bats use echolocation to determine the location of prey and to distinguish between
different insects, even in absolute darkness [25].


3.3. Differential evolution (DE) 

DE is a simple and efficient heuristic for global optimization over continuous spaces. The DE
algorithm was presented by Storn and Price in 1997 and belongs to the family of evolutionary
computation algorithms, which apply biologically inspired search. This population-based algorithm
uses three operators: mutation, crossover, and selection. The method is robust, simple to use, and
well suited to parallel computing, and it requires few control variables [26], [27].
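
The three operators can be sketched as follows. This is a minimal plain-Python version of the common DE/rand/1/bin scheme; the scale factor `f`, the crossover rate `cr`, the bounds, and the toy objective are assumptions made for illustration and are not taken from the experiments of this paper.

```python
import random

def differential_evolution(objective, dim, pop_size=20, n_gen=100,
                           f=0.8, cr=0.9, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` with DE/rand/1/bin (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [objective(ind) for ind in pop]
    for _ in range(n_gen):
        for i in range(pop_size):
            # mutation: combine three distinct individuals other than i
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated component
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    # crossover takes the mutant component
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(hi, max(lo, v))
                else:
                    v = pop[i][d]
                trial.append(v)
            # selection: keep the trial vector only if it is no worse
            t_fit = objective(trial)
            if t_fit <= fit[i]:
                pop[i], fit[i] = trial, t_fit
    best = min(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

de_best, de_val = differential_evolution(sphere, dim=3)
```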


3.4. K-nearest neighbor (KNN) 

KNN is a lazy learning algorithm that classifies a new object based on the classes of the
previously categorized training points. It assigns a new data point to a class according to
similarity metrics [24], [28].
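
A minimal sketch of this idea, assuming Euclidean distance as the similarity metric and a simple majority vote among the `k` nearest training points (the toy samples and labels below are invented for illustration):

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # KNN is "lazy": there is no training step, only distance computation at query time
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy two-feature samples (hypothetical values)
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["benign", "benign", "benign", "attack", "attack", "attack"]

label_a = knn_predict(train_x, train_y, (0.2, 0.1))
label_b = knn_predict(train_x, train_y, (5.5, 5.2))
```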


3.5. Support vector machine (SVM) 

SVM offers an effective approach for classifying clean and intrusive data. SVM achieves high
classification precision in detecting data intrusions [29]. SVM is an application of Vapnik's
structural risk minimization (SRM) principle, which is considered to yield a low generalization
error and to be unlikely to overfit the training data [30].


3.6. Random forest (RF) 

RF is a machine learning algorithm that combines two principles: decision trees and ensemble
learning. RF achieves high detection accuracy and can accommodate outliers and noisy data [31].
The RF algorithm creates many decision trees, each of which acts as a classifier. The final
decision is determined by a vote among all decision trees [32].


4. PROPOSED PROACTIVE FEATURES SELECTION MODEL (PFS) 

Our previous work [33] provided a critical review of the mechanisms designed and implemented to
detect DRDoS attacks. PFS is inspired by the principle of adapting system parameters dynamically
as the behavior of the algorithm changes. Dynamic systems are often preferred over static ones
because they perform well, are flexible, and are suitable for solving many problems [x1]. In a
dynamic system, the probability of finding the optimal solution and of escaping stagnation in a
local optimum is high [x2]. Metaheuristics have been recognized as fast, flexible, easy to
implement, and successful in optimization across different fields [x3]-[x5]. A limitation of
several metaheuristic models is that they often suffer from stagnation during the search process.
The applied dynamic system therefore reduces stagnation in local optima and enhances the
performance of systems that use metaheuristic algorithms. The wrapper model is a compelling
feature selection method [x6]. It selects features by trial and error and then updates the
corresponding features after each iteration.

Furthermore, the features are initially selected at random. Wrapper models select features based
on the ranking that the metaheuristic technique gives to these features. Generally, only features
whose rank is higher than a threshold are selected, and this threshold is a fixed value, often set
to 0.5. The proposed system sets the threshold dynamically, changing it adaptively as the search
progresses. Figure 1 shows the graphical abstract of the proposed model, and Figure 2 illustrates
the main steps of the proposed system.
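
The threshold rule described above can be sketched in a few lines. The function name `select_features` and the rank values are hypothetical; the sketch only shows how the same rank vector yields different feature subsets under different thresholds.

```python
def select_features(ranks, theta=0.5):
    """Return the indices of features whose rank is strictly above the threshold."""
    return [i for i, r in enumerate(ranks) if r > theta]

# Hypothetical rank vector produced by a metaheuristic for four features
ranks = [0.90, 0.20, 0.51, 0.50]

fixed = select_features(ranks)              # conventional fixed threshold of 0.5
stricter = select_features(ranks, theta=0.8)  # a higher threshold keeps fewer features
```

With the fixed threshold, features 0 and 2 are kept; raising the threshold to 0.8 keeps only feature 0. This is why moving the threshold during the search changes which subsets the wrapper evaluates.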


4.1. Data preparation stage 

Data preprocessing: the first stage converts the raw data into an analysis-ready format by
preprocessing the CICDDoS2019 dataset. Several preprocessing procedures are performed: i) import
the dataset into the Python IDE; ii) search for incomplete data and outliers; iii) eliminate noise
from the data; and iv) split the data used to build the model into training and testing sets.

Data normalization: the mechanism by which the values of each feature are converted or scaled into
a common range. According to (1), the dataset was normalized to the range [0, 1]; normalization is
important for eliminating the bias toward features with larger values. We used 20% of the original
CICDDoS2019 dataset, split into two parts: 70% for training and 30% for testing.


$$X_{\text{normalized}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{1}$$

where $X_{\min}$ is the minimum value of the feature and $X_{\max}$ is the maximum value of the
feature.
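
A minimal sketch of min-max normalization as in (1), together with a 70/30 split, in plain Python. The helper names, the handling of constant columns, and the shuffling seed are assumptions made for the example; the paper does not specify them.

```python
import random

def min_max_normalize(column):
    """Scale one feature column to [0, 1] following equation (1)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information and maps to 0.0
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

scaled = min_max_normalize([2.0, 4.0, 6.0])
train, test = train_test_split(list(range(10)))
```

In practice the minimum and maximum would normally be computed on the training part only and then reused for the testing part, so that no information from the test data leaks into the model.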
Figure 1. The process design of the proposed PFS model
Figure 2. Flowchart of the proposed PFS model


4.2. Enhance feature selection by using the PFS model 
Feature selection is a challenging issue. When the dimensionality of the feature space is high,
the choice of appropriate features is crucial. SWEVO metaheuristic algorithms are well suited to
this problem. Three feature subsets were derived from the proposed model, based on the PSO, BA,
and DE algorithms. The proposed system sets the threshold θ dynamically, so that only features
whose rank is higher than the threshold are selected, in order to find an optimal value of θ.
Equations (2) and (3) calculate θ for wrapper models,


$$\Delta\theta = \theta_{Max} - \theta_{Min} \tag{2}$$

where $\theta_{Max}$ and $\theta_{Min}$ can be determined by the user,


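$$\theta' = \theta_{Min} + \Delta\theta \times \left(1 - \left(\frac{\mathit{currentIter}}{\mathit{MaxIter}}\right)^{2\lambda}\right) \tag{3}$$

where $\theta_{Min}$ and $\Delta\theta$ are taken from (2), $\mathit{currentIter}$ is the current
iteration, $\mathit{MaxIter}$ is the maximum number of iterations, and $\lambda$ is a random
variable in the interval [-1, 1].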
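
To make the intended behavior concrete (a high threshold early in the search that decreases as the iterations progress), the following sketch uses a simplified decreasing schedule. It is not a reproduction of (3): the exponent `power` is a hypothetical stand-in for the random term, and the default bounds 0.3 and 0.9 are arbitrary values chosen for the example.

```python
def adaptive_threshold(current_iter, max_iter,
                       theta_min=0.3, theta_max=0.9, power=2.0):
    """Decreasing threshold schedule (simplified; `power` is an assumed constant)."""
    delta = theta_max - theta_min            # equation (2)
    progress = current_iter / max_iter       # 0 at the start, 1 at the end
    theta = theta_min + delta * (1.0 - progress ** power)
    # Keep the result inside the user-defined range
    return min(theta_max, max(theta_min, theta))

schedule = [adaptive_threshold(t, 100) for t in range(0, 101, 25)]
```

The schedule starts at `theta_max`, so only highly ranked features pass at first, and ends at `theta_min`, where more features are admitted.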

In the initial stage of the search process, the system needs to apply as high a θ as possible, to
restrict the selection to features with high rank values only. The value of θ is reduced as the
search progresses. Figure 3 shows the probability of θ during the search iterations.
Figure 3. Probability of θ


Optimal feature selection is influenced by the θ value chosen, which depends on the initial value
of θ. The upper and lower bounds of the range of θ are therefore not constant and can be changed
by the user based on the resulting quality of the model; that is, the user can specify the range
within which θ is located. For this reason, we used an adaptive threshold, with the initial θ
representing the threshold. We set the adaptive threshold with an initial value used to
distinguish between normal and abnormal behaviors. Because setting an acceptable threshold is not
straightforward, the user can change this value based on the results and expand the search space
if the results are not satisfactory. The adaptive threshold is more useful than other thresholds
because of its ability to adapt to the changes that may occur in network traffic during an attack,
when setting a fixed initial threshold value becomes challenging.


5. EXPERIMENTS 

This section presents in detail the dataset used and the data preprocessing protocol. We also
provide the performance metrics used in our experiments and show the architecture of our model.
Finally, we provide a comparative analysis of our model and that of various classifiers. All
experiments were performed on a 2.90 GHz i7 machine with 16 GB RAM running the 64-bit Windows 10
Pro operating system. The PyCharm IDE and Python 3.8 were used to execute our model. The total
number of features in the CICDDoS2019 dataset [7] is 88. We used 20% of the dataset, sampled
according to the time of each attack in the original dataset. In designing the PFS model, we
focused on enhancing the detection accuracy and reducing the number of features; these two factors
are very important when designing a new approach for a detection mechanism. IDS efficiency is
measured by metrics based on its ability to assign network traffic to the correct category.
Algorithm 1. A proactive features selection model (PFS)
Input: Initialization parameters
Output: Optimal features
1.  theta ← 0.5                    // initially set the threshold (theta = 0.5)
2.  sigma ← 0                      // sigma is a stagnation-sensitive parameter
3.  while iter < MaxIteration do
4.      Update population according to the optimization method
5.      LocalBest ← find(best)
6.      if GlobalBest < LocalBest then
7.          GlobalBest ← LocalBest
8.          sigma ← 0
9.      else
10.         sigma ← sigma + 1
11.         if sigma < 2 * PopulationSize then
12.             sigma ← 0
13.             theta ← select randomly between min and max using (3)
14. return Optimal features
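
The control flow of Algorithm 1 can be sketched in Python as follows. Several parts are assumptions made so that the sketch runs on its own: the population update is a random perturbation standing in for PSO, BA, or DE; the fitness function is a toy that rewards three arbitrarily chosen "relevant" features; and the new threshold is drawn uniformly from a user-defined range rather than computed from (3). The stagnation condition is kept as printed in the listing.

```python
import random

def mask(ranks, theta):
    """Indices of features whose rank exceeds the threshold theta."""
    return [i for i, r in enumerate(ranks) if r > theta]

def pfs_search(fitness, n_features, pop_size=10, max_iter=50,
               theta_min=0.3, theta_max=0.7, seed=0):
    rng = random.Random(seed)
    theta = 0.5                       # step 1: initial threshold
    sigma = 0                         # step 2: stagnation counter
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    best_subset, best_score = [], float("-inf")
    for _ in range(max_iter):         # step 3
        # step 4: stand-in update; a real run would apply PSO, BA, or DE here
        pop = [[min(1.0, max(0.0, r + rng.gauss(0.0, 0.1))) for r in ind]
               for ind in pop]
        # step 5: best subset of this iteration under the current threshold
        subsets = [mask(ind, theta) for ind in pop]
        local_subset = max(subsets, key=fitness)
        local_score = fitness(local_subset)
        if best_score < local_score:  # steps 6-8: improvement resets the counter
            best_subset, best_score = local_subset, local_score
            sigma = 0
        else:                         # steps 9-13: no improvement
            sigma += 1
            if sigma < 2 * pop_size:  # condition kept as printed in Algorithm 1
                sigma = 0
                theta = rng.uniform(theta_min, theta_max)
    return best_subset, best_score    # step 14

RELEVANT = {0, 2, 5}                  # hypothetical "informative" features

def toy_fitness(subset):
    """Reward relevant features and penalize subset size (illustrative only)."""
    return len(RELEVANT & set(subset)) - 0.1 * len(subset)

subset, score = pfs_search(toy_fitness, n_features=8)
```

In the setting of this paper, `fitness` would instead train and score a classifier (KNN, RF, or SVM) on the selected columns of the dataset.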


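$$\text{Accuracy} = \frac{\sum TP_{\text{For Each Attack Class}} + TN_{\text{Benign}}}{\sum TP_{\text{For Each Attack Class}} + \sum FN_{\text{For Each Attack Class}} + TN_{\text{Benign}} + FP_{\text{Benign}}} \tag{4}$$

$$\text{Precision} = \frac{\sum TP_{\text{For Each Attack Class}}}{\sum \left(TP_{\text{For Each Attack Class}} + FP_{\text{Benign}}\right)} \tag{5}$$

$$\text{Recall} = \frac{\sum TP_{\text{For Each Attack Class}}}{\sum \left(TP_{\text{For Each Attack Class}} + FN_{\text{For Each Attack Class}}\right)} \tag{6}$$

$$\text{F1 score} = \frac{2 \times (\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}} \tag{7}$$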
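
The following sketch computes the four metrics from per-class counts, under one reading of (4)-(7): true positives and false negatives are summed over the attack classes, while the benign true negatives and false positives are each counted once. The function name and the counts in the example are invented for illustration and are not results from this paper.

```python
def pfs_metrics(tp_per_attack, fn_per_attack, tn_benign, fp_benign):
    """Accuracy, precision, recall, and F1 from per-class counts (one reading of (4)-(7))."""
    tp = sum(tp_per_attack.values())   # true positives over all attack classes
    fn = sum(fn_per_attack.values())   # false negatives over all attack classes
    accuracy = (tp + tn_benign) / (tp + fn + tn_benign + fp_benign)
    precision = tp / (tp + fp_benign)  # assumption: benign false positives counted once
    recall = tp / (tp + fn)
    f1 = 2 * (recall * precision) / (recall + precision)
    return accuracy, precision, recall, f1

# Hypothetical counts for two attack classes and the benign class
acc, prec, rec, f1 = pfs_metrics(
    tp_per_attack={"DRDoS_DNS": 80, "DRDoS_UDP": 90},
    fn_per_attack={"DRDoS_DNS": 20, "DRDoS_UDP": 10},
    tn_benign=95,
    fp_benign=5,
)
```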


5.1. Performance analysis (results and discussion) 

The results of the proposed PFS model have been compared with the results of the model in the base
paper [7], whose accuracy metrics are shown in Table 1. The results show that the PFS model
outperforms the model in the base paper on these accuracy measures. In addition, the selected
dataset was tested with the machine learning algorithms KNN, RF, and SVM without the PFS model,
and with the SWEVO metaheuristic algorithms PSO, BA, and DE without PFS. Moreover, the number of
selected features has been reduced.


Table 1. Results of the performance test for the reference [7]

Algorithms                         Precision  Recall  F1 score
Decision Tree ID3                  0.78       0.65    0.69
Random Forest                      0.77       0.56    0.62
Naive Bayes                        0.41       0.11    0.05
Multinomial Logistic Regression    0.25       0.02    0.02


Table 2 shows the results of running KNN alone and then PSO_KNN, BA_KNN, and DE_KNN without PFS,
followed by the PFS model applied to each of these three models. The results indicate that the
accuracy achieved with PFS is better than that of the corresponding models without PFS. The number
of features was also reduced when running PFS: PFS_PSO_KNN selects 19 features, PFS_BA_KNN selects
34 features, and PFS_DE_KNN selects 45 features. PFS_PSO_KNN is better than the other two
proactive models, PFS_BA_KNN and PFS_DE_KNN, in terms of accuracy and number of features.

Table 3 shows the results of running RF alone and then PSO_RF, BA_RF, and DE_RF without PFS,
followed by the PFS model applied to each of these three models. The results indicate that the
accuracy achieved with PFS is better than that of the corresponding models without PFS. Starting
from the 88 original features, PFS_PSO_RF reduced the number of features to 49, PFS_BA_RF reduced
it to 41, and PFS_DE_RF reduced it to 53. PFS_DE_RF is better than the other two proactive models,
PFS_PSO_RF and PFS_BA_RF, in terms of accuracy and the other accuracy metrics: precision, recall,
and F1 score.
Table 2. PFS model performance and KNN with the three SWEVO algorithms relative to the
different attack types (metric values in %)

Model                DRDoS_DNS  DRDoS_LDAP  DRDoS_NetBIOS  DRDoS_SSDP  DRDoS_UDP  Normal  Accuracy  Precision  Recall  F1 score  Number of features
KNN                  89.79      78.77       96.31          66.77       99.8       99.61   81.94     80.64      80.66   80.65     88
PSO_KNN without PFS  89.02      73.47       97.46          85.58       99.8       99.54   85.23     84.97      84.6    84.78     45
PFS_PSO_KNN          91.43      79.55       97.21          89.79       98.48      99.45   89.59     90.04      89.64   89.84     19
BA_KNN without PFS   89.1       82.75       96.47          74.07       99.81      99.71   84.62     84.53      83.75   84.14     37
PFS_BA_KNN           91.27      82.67       96.99          87.39       98.47      99.53   89.56     90.28      89.54   89.91     34
DE_KNN without PFS   93.55      81.22       96.73          72.96       99.8       99.57   85.11     83.97      84.22   84.09     65
PFS_DE_KNN           91.8       81.12       97.13          84.38       99.78      99.57   88.27     88.71      87.85   88.28     45


Table 3. PFS model performance and RF with the three SWEVO algorithms relative to the different
attack types and benign (metric values in %)

Model                DRDoS_DNS  DRDoS_LDAP  DRDoS_NetBIOS  DRDoS_SSDP  DRDoS_UDP  Normal  Accuracy  Precision  Recall  F1 score  Number of features
RF                   81.89      76.04       92.69          93.19       95.97      95.49   83.07     84.91      83.49   84.2      88
PSO_RF without PFS   84.74      80.61       94.46          90.63       97.67      97.17   85.78     86.74      86.16   86.45     46
PFS_PSO_RF           86.69      83.26       96.43          91.82       99.74      99.15   87.89     88.61      88.3    88.45     49
BA_RF without PFS    86.73      80.91       96.41          93.58       99.75      99.32   87.63     89.03      88      88.51     37
PFS_BA_RF            86.65      82.07       96.44          92.08       99.74      99.13   87.69     88.37      88.08   88.22     41
DE_RF without PFS    81.03      77.8        90.72          86.13       94.08      93.54   82.13     82.61      82.52   82.57     59
PFS_DE_RF            86.65      83.44       96.43          92.11       99.71      99.19   87.98     88.76      88.39   88.57     53


Table 4 shows the results of running SVM alone and then PSO_SVM, BA_SVM, and DE_SVM without PFS,
followed by the PFS model applied to each of these three models. The results indicate that the
accuracy achieved with PFS is better than that of the corresponding models without PFS. Starting
from the 88 original features, PFS_PSO_SVM reduced the number of features to 48, PFS_BA_SVM
reduced it to 30, and PFS_DE_SVM reduced it to 45. PFS_BA_SVM is better than the other two
proactive models, PFS_PSO_SVM and PFS_DE_SVM, in terms of accuracy, number of features,
precision, and F1 score.

Figure 4 shows the accuracy curves for PSO_KNN. Although PSO_KNN without PFS performs better in
terms of detection accuracy in the early iterations, PSO_KNN with PFS yields a higher detection
accuracy rate after iteration 11. The final accuracy rate achieved by PSO_KNN with PFS is 89.59%,
compared with 85.23% without PFS. Figure 5 shows the accuracy curves for BA_KNN. BA_KNN with PFS
performs better in detection accuracy from the first iteration and yields a higher detection
accuracy rate. The final accuracy rate achieved by BA_KNN with PFS is 89.56%, compared with 84.62%
without PFS. Figure 6 shows the accuracy curves for DE_KNN. DE_KNN with PFS performs better in
detection accuracy from the first iteration and yields a higher detection accuracy rate. The final
accuracy rate achieved by DE_KNN with PFS is 88.27%, compared with 85.11% without PFS.

Figure 7 shows the accuracy curves for PSO_RF. Although PSO_RF without PFS performs better in
detection accuracy in the early iterations, PSO_RF with PFS yields a higher detection accuracy
rate after iteration 5. The final accuracy rate achieved by PSO_RF with PFS is 87.89%, compared
with 85.78% without PFS. Figure 8 shows the accuracy curves for BA_RF. Although BA_RF without PFS
performs better in detection accuracy in the early iterations, BA_RF with PFS yields a higher
detection accuracy rate after iteration 161. The final accuracy rate achieved by BA_RF with PFS is
87.69%, compared with 87.63% without PFS. Figure 9 shows the accuracy curves for DE_RF. DE_RF with
PFS performs better in detection accuracy from the first iteration and yields a higher detection
accuracy rate. The final accuracy rate achieved by DE_RF with PFS is 87.98%, compared with 82.13%
without PFS.


Table 4. PFS model performance and SVM with the three SWEVO algorithms relative to the
different attack types and benign (metric values in %)

Model                DRDoS_DNS  DRDoS_LDAP  DRDoS_NetBIOS  DRDoS_SSDP  DRDoS_UDP  Normal  Accuracy  Precision  Recall  F1 score  Number of features
SVM                  80.05      74.23       81.47          83.51       72.83      93.76   70.3      73.31      72.13   72.72     88
PSO_SVM without PFS  83.13      77.29       85.08          86.83       76.08      96.85   73.61     76.85      75.46   76.15     31
PFS_PSO_SVM          85.39      79.6        87.48          89.04       78.38      99.15   75.92     79.14      77.78   78.41     48
BA_SVM without PFS   85.4       79.55       87.64          88.92       78.43      99.15   75.91     79.18      77.78   78.47     35
PFS_BA_SVM           85.35      81.4        88.76          81.99       82.56      99.15   76.38     79.45      77.77   78.6      30
DE_SVM without PFS   85.4       79.63       87.83          88.92       78.5       99.15   76        79.23      77.87   78.54     46
PFS_DE_SVM           85.4       79.62       87.7           89.04       78.47      99.18   76.01     79.23      77.88   78.55     45
Figure 10 shows the accuracy curves for PSO_SVM. PSO_SVM with PFS performs better in detection
accuracy from the first iteration and yields a higher detection accuracy rate. The final accuracy
rate achieved by PSO_SVM with PFS is 75.92%, compared with 73.61% without PFS. Figure 11 shows the
accuracy curves for BA_SVM. BA_SVM with PFS performs better in detection accuracy from the first
iteration and yields a higher detection accuracy rate. The final accuracy rate achieved by BA_SVM
with PFS is 76.38%, compared with 75.91% without PFS. Figure 12 shows the accuracy curves for
DE_SVM. Although DE_SVM without PFS performs better in detection accuracy in the early iterations,
DE_SVM with PFS yields a higher detection accuracy rate after iteration 178. The final accuracy
rate achieved by DE_SVM with PFS is 76.01%, compared with 76% without PFS.
6. CONCLUSION

The number of features in the dataset influences the performance of the detection mechanism;
therefore, reducing the number of features is necessary to improve the detection accuracy rate. In
the PFS model, we use an adaptive threshold to enhance detection accuracy by distinguishing normal
from abnormal behavior in the dataset. A few models have been suggested to detect DRDoS attacks,
but some failed or achieved a very low detection accuracy rate; for those reasons, we proposed the
new PFS model to detect DRDoS attacks. The PFS model is based on optimization algorithms and
machine learning classifiers. The comparison tables and the accuracy curves show that the PFS
model outperforms the compared models in detecting DRDoS attacks, with a high true-positive rate
and a low false-negative rate. Our future work will focus on enhancing the detection rate and
minimizing the false alarm rate by using other techniques such as clustering or neural networks.